- Path: duck.ibh-dd.de!beck
- From: beck@duck.ibh-dd.de (Andre Beck)
- Newsgroups: comp.lang.c
- Subject: Who's dumb: me or my compiler ?
- Followup-To: poster
- Date: 12 Feb 1996 15:31:57 GMT
- Organization: IBH - Xlink PoP Dresden
- Message-ID: <4fnmhd$c1o@micky.ibh-dd.de>
- NNTP-Posting-Host: duck.ibh-dd.de
- X-Newsreader: TIN [version 1.2 PL1]
-
- Hi,
-
- I could have had a really bad day searching for this bug, but for
- some lucky circumstances, I trapped it instantly. However I don't
- know whether I wrote bad code or whether a number of compilers do
- it wrong in the same way - the latter doesn't sound that probable ;)
-
- My problem revolves around the expression
-
- x = (*ptr++ << 8) | *ptr++;
-
- which is pretty simple (IMHO). Read two consecutive bytes from memory
- and make a word out of them. Not nice, I know, even endian-dependent. But
- nothing to fail on. Now, GCC fails on 3 different platforms, and DEC
- cc for Ultrix fails as well. Here's my trigger program:
-
- ------------------------- snip -------------------------
- /*
-
- This program produces an IMHO wrong result on a number of compilers and
- different architectures. Either I don't understand C correctly, or several
- compilers (including GNU C) optimize an expression containing two
- invocations of an autoincrementing pointer dereference into one dereference,
- treating the dereference as a common subexpression (which it isn't, as it
- has a side effect).
-
- */
-
- #include <stdio.h>
- #include <stdlib.h>
-
- void f(const unsigned char *pkt)
- {
- const unsigned char *dat = pkt;
- unsigned short proto;
-
- /* The intention is to read a word at a possibly odd address in an
- endian-dependent way
- */
- proto = (*dat++ << 8) | *dat++;
-
- printf("%04x\n", proto);
- }
-
- /*
-
- Disassembly of optimized function f on a SPARC 10 looks like:
-
- section .text
- f()
- 10794: 9d e3 bf 90 save %sp, -112, %sp
- 10798: d2 0e 00 00 ldub [%i0], %o1
- ^^^^^^^^^^^^^^^^^^^^^^^
- Loads the first byte from pointer
-
- 1079c: 11 00 00 42 sethi %hi(0x10800), %o0
- 107a0: 90 12 20 e8 or %o0, 232, %o0
- 107a4: 95 2a 60 08 sll %o1, 8, %o2
- ^^^^^^^^^^^^^^^^^^^^^^^^
- Shift left of this value is OK
-
- 107a8: 40 00 40 8a call printf
- 107ac: 92 12 40 0a or %o1, %o2, %o1
- ^^^^^^^^^^^^^^^^^^^^^^^^^^
- The OR would be OK, but the second byte was never
- fetched from memory; %o1 is just reused.
- Is the optimizer taking *dat++ for a
- common subexpression (which it isn't)?
-
- 107b0: 81 c7 e0 08 ret
- 107b4: 81 e8 00 00 restore
- */
-
- /*
-
- Disassembly on a MIPS reveals exactly the same behavior:
-
- f:
- [gccbug.c: 88] 0x4001b0: 27bdffe8 addiu sp,sp,-24
- [gccbug.c: 87] 0x4001b4: afbf0010 sw ra,16(sp)
- [gccbug.c: 87] 0x4001b8: 90850000 lbu a1,0(a0)
- ^^^^^^^^^^^^^^^^
- Fetch the first byte
-
- [gccbug.c: 89] 0x4001bc: 27848010 addiu a0,gp,-32752
- [gccbug.c: 95] 0x4001c0: 00051200 sll v0,a1,8
- [gccbug.c: 90] 0x4001c4: 0c10013c jal printf
- [gccbug.c: 91] 0x4001c8: 00a22825 or a1,a1,v0
- ^^^^^^^^^^^^^^^^
- Seems v0 is treated as a
- common subexpression
-
- [gccbug.c: 120] 0x4001cc: 8fbf0010 lw ra,16(sp)
- [gccbug.c: 92] 0x4001d0: 27bd0018 addiu sp,sp,24
- [gccbug.c: 96] 0x4001d4: 03e00008 jr ra
- [gccbug.c: 93] 0x4001d8: 00000000 nop
-
- */
-
- /*
-
- And now the dessert: x86 code as generated on Linux. Here the bug
- triggers even when not optimizing at all. Probably this processor does
- not have enough registers to allow a compiler to _not_ optimize...
-
- 00001088 <_f> pushl %ebp
- 00001089 <_f+1> movl %esp,%ebp
- 0000108b <_f+3> movl 0x8(%ebp),%eax
- 0000108e <_f+6> movzbw (%eax),%dx ; fetching first byte
- 00001092 <_f+a> movl %edx,%eax
- 00001094 <_f+c> shlw $0x8,%ax ; shifting it up
- 00001098 <_f+10> orw %dx,%ax ; and never fetching the second one
- 0000109b <_f+13> andl $0xffff,%eax
- 000010a0 <_f+18> pushl %eax
- 000010a1 <_f+19> pushl $0x1078
- 000010a6 <_f+1e> call 60000a00 <_printf>
- 000010ab <_f+23> movl %ebp,%esp
- 000010ad <_f+25> popl %ebp
- 000010ae <_f+26> ret
-
- */
-
- int main()
- {
- unsigned short bla = 0x6622;
- unsigned char *buf = (unsigned char *) &bla;
-
- printf("This program should generate either 6622 or 2266 as output:\n");
-
- f(buf);
-
- return(0);
- }
- ---------------------------- snap -----------------------------
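
For comparison, here is a minimal sketch of the same two-byte read with each pointer access in its own statement. It rests on one assumption about the trigger program: that the trouble comes from `dat` being modified twice within a single expression, with no sequence point between the two `dat++` operations, which the C standard leaves undefined. The helper name `read_be16` is hypothetical and not part of the program above.

```c
#include <stdio.h>

/* Hypothetical helper (not from the trigger program): combine two
   consecutive bytes, most significant byte first. Each read is a
   separate statement, so the pointer is modified at most once
   between sequence points. */
static unsigned short read_be16(const unsigned char *p)
{
    unsigned short hi = *p++;   /* first byte, then advance */
    unsigned short lo = *p;     /* second byte */

    return (unsigned short)((hi << 8) | lo);
}

/* Print the combined word in the same format the trigger program uses. */
static void print_be16(const unsigned char *p)
{
    printf("%04x\n", (unsigned)read_be16(p));
}
```

Calling `print_be16` on a buffer holding the bytes 0x66 and 0x22, in that order, should print 6622 regardless of the host byte order, since the byte order is fixed by the code rather than by the machine.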
-
- If anyone out there can tell me what's wrong here, I would be pleased.
- Please answer by email; I don't have enough time to follow this
- high-traffic group in detail. Instead, I'll post a summary if it makes any
- sense.
-
- Thanks,
- Andre.
- --
- +-o-+--------------------------------------------------------+-o-+
- | o | \\\- Brain Inside -/// | o |
- | o | ^^^^^^^^^^^^^^ | o |
- | o | Andre' Beck (ABPSoft) beck@ibh-dd.de XLink PoP Dresden | o |
- +-o-+--------------------------------------------------------+-o-+
-